DETAILED ACTION



Drawings 

This application has been filed with informal drawings, which are acceptable for 
examination purposes only. Formal drawings will be required when the application is 
allowed. 



The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(a) the invention was known or used by others in this country, or patented or described in a printed 
publication in this or a foreign country, before the invention thereof by the applicant for a patent. 

Claim Rejections - 35 USC § 102

Claims 1, 5 to 7, 9, 15 and 20 are rejected under 35 U.S.C. 102(a) as being
anticipated by Ferrell.

Regarding independent claim 1, Ferrell discloses an interactive speech and
language training system, comprising:

"a first module configured to convert input text to audible speech in a selected
language, the audible speech being patterned after a model" - speech synthesizer 74
forms an audio representation of the vocabulary elements; vocabulary library 68
includes recorded digitized representations of vocabulary elements ("models") (column
7, lines 40 to 45: Figure 3); a vocabulary element, such as a word or phrase is
presented both visually and aurally to the individual in a native language or a non-native
language (column 4, lines 33 to 57: Figures 1 and 4);

"a user interface configured to receive utterances spoken by a user in response 
to a prompt to replicate the audible speech" - a vocabulary element is presented both 
visually and aurally to the individual ("a prompt"), and the individual is given a period of 
time to initiate a response; the user's response is received; for example, the user may 
pronounce the vocabulary element (column 4, line 44 to column 5, line 10: Figure 1:
Steps 12 to 14; Figure 4);

"a second module configured to recognize the utterances and provide feedback 
to the user, the feedback being comprised of a confidence measure reflecting a 
precision at which the user replicates the audible speech in the selected language 
based on a comparison of the utterances to one of the audible speech and the model" - 
the responses are evaluated for correctness and appropriate feedback is presented to 
the user based on the correctness of the response; in the preferred embodiment, the 
feedback includes both visual and aural feedback; visual feedback is provided by a 
needle gauge at the bottom of the screen which indicates the degree of correct 
pronunciation ("confidence measure")(column 5, lines 8 to 25: Figure 1; Steps 18 and 
20); icon 84 provides visual feedback in the form of a confidence meter which indicates 
the correctness of a user response (column 8, lines 1 to 3: Figure 4). 



Application/Control Number: 09/392,844
Art Unit: 2654

Regarding claims 5 and 6, Ferrell discloses vocabulary library 68 ("files for
storing model pronunciations") includes digital representations of vocabulary elements
(column 7, lines 40 to 45: Figure 3); a vocabulary element may be a phoneme
("phoneme model"), word, sentence, or paragraph; the aural presentation preferably
includes a synthesized utterance corresponding to the vocabulary element; the user
may pronounce the vocabulary element (column 4, line 40 to column 5, line 10: Figure
1: Steps 12 to 14).

Regarding claim 7, Ferrell discloses the presentation is divided into multiple 
lessons incorporating new vocabulary elements (column 4, lines 55 to 57; column 5, 
lines 26 to 36: Figure 2). 

Regarding claim 9, Ferrell discloses unfamiliar vocabulary elements are 
introduced with a definition ("dictionary files")(column 5, lines 33 to 36: Figure 2). 

Regarding claim 15, Ferrell discloses vocabulary library 68 ("specific 
pronunciation files") includes digital representations of vocabulary elements (column 7, 
lines 40 to 45: Figure 3). 

Regarding claim 20, Ferrell discloses icon 84 provides visual feedback in the 
form of a confidence meter, which indicates the correctness of a user response (column 
8, lines 1 to 3: Figure 4). 



The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 



Claim Rejections - 35 USC § 103 

Claims 2 to 4 and 16 to 18 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Ferrell in view of Henton. 

Concerning independent claim 16, Ferrell discloses an interactive language 
training system, comprising: 

"a first module configured to convert input text to audible speech in a selected 
language, the audible speech indicative of a model" - speech synthesizer 74 forms an 
audio representation of the vocabulary elements; vocabulary library 68 includes 
recorded digitized representations of vocabulary elements ("models") (column 7, lines 
40 to 45: Figure 3); a vocabulary element, such as a word or phrase is presented both 
visually and aurally to the individual in a native language or a non-native language 
(column 4, lines 33 to 57: Figure 1); 

"a user interface configured to receive utterances spoken by a user in response 
to a prompt to replicate the audible speech" - a vocabulary element is presented both 
visually and aurally to the individual ("a prompt"), and the individual is given a period of 
time to initiate a response; the user's response is received; for example, the user may 
pronounce the vocabulary element (column 4, line 44 to column 5, line 10: Figure 1:
Steps 12 to 14; Figure 4);

"a third module configured to recognize the utterances and provide feedback to 
the user, the feedback being comprised of at least one of a score, an icon and an audio 
segment reflecting a precision at which the user replicates the audible speech in the 
selected language based on a comparison of the utterances to one of the audible 
speech and the model" - the responses are evaluated for correctness and appropriate 
feedback is presented to the user based on the correctness of the response; in the
preferred embodiment, the feedback includes both visual and aural feedback; visual
feedback is provided by a needle gauge at the bottom of the screen which indicates the
degree of correct pronunciation ("confidence measure") (column 5, lines 8 to 25: Figure
1; Steps 18 and 20); icon 84 provides visual feedback in the form of a confidence meter
which indicates the correctness of a user response (column 8, lines 1 to 3: Figure 4). 

Ferrell discloses visually displayed vocabulary elements ("input text") (Figure 4),
but omits: 

"a second module synchronized to the first module, the second module 
producing an animated image of a human face and head pronouncing the audible 
speech." 

However, Henton teaches a method and apparatus for synthetic speech with an 
animated face, suggesting that it is well known to synchronize imaging of a face with 
synthetic speech for the purpose of instructing the user (Column 3, Lines 33 to 49:
Figure 3). It would have been obvious to one of ordinary skill in the art to include an
animated face module in Ferrell that synchronizes imaging with synthetic speech as 
taught by Henton because it is well known to utilize an animated face synchronized with 
synthetic speech for the purpose of simulating an instructor. 



Concerning independent claim 17, Ferrell discloses an interactive language 
training method, comprising: 

"converting input text data to audible speech data" - speech synthesizer 74
forms an audio representation of the vocabulary elements (column 7, lines 40 to 45: 
Figure 3); 

"generating audible speech comprising phonemes based on the audible speech 
data" - a vocabulary element may be a phoneme ("phoneme model") (column 4, lines 
44 to 47: Figure 1: Steps 12 to 14); 

"outputting the audible speech through an audio output device" - aural 
presentation of vocabulary elements utilizes speakers 76 (column 7, lines 37 to 39: 
Figure 3); 

"prompting the user to replicate the audible speech" - a vocabulary element is 
presented both visually and aurally to the individual ("a prompt"), and the individual is 
given a period of time to initiate a response; the user's response is received; for 
example, the user may pronounce the vocabulary element (column 4, line 44 to column 
5, line 10: Figure 1: Steps 12 to 14; Figure 4); 

"recognizing utterances generated by the user in response to the prompting" - 
speech recognition device 70 utilizes microphone 72 to capture and analyze audio input 
from the user (column 7, lines 17 to 19: Figure 3); 

"comparing the audible speech to the utterances" - the responses are evaluated 
for correctness (column 5, lines 1 to 10; Figure 1; Step 18); 

"providing feedback to the user based on the comparison, the feedback 
comprised of at least one of a score, an icon and an audio segment reflecting a 
precision at which the user replicates the audible speech" - in the preferred 
embodiment, the feedback includes both visual and aural feedback; visual feedback is
provided by a needle gauge at the bottom of the screen which indicates the degree of 
correct pronunciation ("confidence measure")(column 5, lines 8 to 25: Figure 1; Steps 
18 and 20); icon 84 provides visual feedback in the form of a confidence meter which 
indicates the correctness of a user response (column 8, lines 1 to 3: Figure 4). 

Ferrell discloses visually displayed vocabulary elements ("input text")(Figure 4), 
but omits: 

"generating an animated image of a face and head pronouncing the audible 
speech" and "synchronizing the audible speech and the video." 

However, Henton teaches a method and apparatus for synthetic speech with an 
animated face, suggesting that it is well known to synchronize imaging of a face with 
synthetic speech for the purpose of instructing the user (Column 3, Lines 33 to 49:
Figure 3). It would have been obvious to one of ordinary skill in the art to include an
animated face module in Ferrell that synchronizes imaging with synthetic speech as 
taught by Henton because it is well known to utilize an animated face synchronized with 
synthetic speech for the purpose of simulating an instructor. 

Concerning claim 2, similar considerations apply. 

Concerning claim 3, Henton teaches a face and head, which is a "transparent" 
line drawing (Figure 3). 

Concerning claim 4, Ferrell must implicitly include at least a volume control for 
speakers 76. 

Concerning claim 18, Ferrell discloses lessons (Figure 2) and vocabulary library
68 (Figure 3); these are "stored lesson files" as software in memory of processor 60 
(column 7, lines 14 to 25: Figure 3). 

Claims 8 and 11 to 14 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Ferrell in view of Mostow et al. 

Concerning claim 8, Ferrell does not expressly disclose that the input text is 
based on data received from a source outside the system. However, Mostow et al. 
teaches a related reading and pronunciation tutor involving speech recognition, where 
an external application, such as a tutor for another domain, may dynamically supply text
for the tutor to help the user to read (Column 8, Lines 59 to 61: Figure 1). It would have
been obvious to supply the input text from a source outside the system in the interactive 
language instruction system of Ferrell as suggested by Mostow et al. for the purpose of 
providing more flexibility in lesson content. 

Concerning claims 11 to 13, Ferrell omits tables storing mapping data between
word subgroups and vocabulary words, between words and vocabulary words, and 
between words and examples of parts of speech. However, Mostow et al. teaches a 
related reading and pronunciation tutor where an automatic enhancement function 
includes a heuristic algorithm using tables. Lookup of information in tables identifies 
sets of words that rhyme with one another, words that look alike, start or end the same 
etc., by constructing a key for each word that says what set is that word's equivalence 
class. The word may also be decomposed into its root word and affixes, which implicitly 
involves identification of the word's part of speech (Column 9, Line 52 to Column 10,
Line 33). It would have been obvious to one of ordinary skill in the art to include tables
of related words as taught by Mostow et al. in the interactive language instruction
system of Ferrell for the purpose of inferring the pronunciation of words not found in a 
dictionary. 

Concerning claim 14, Ferrell omits tables of punctuation, but Mostow et al. 
teaches that the tutoring function takes account of phrase boundaries as indicated by 
commas and certain other punctuation for the purpose of more accurately aligning
recognition results against the text (Column 5, Lines 11 to 22). It would have been
obvious to one of ordinary skill in the art to include a table of punctuation indicating 
phrase boundaries in the interactive language instruction system of Ferrell for the 
purpose of more accurately aligning recognition results against the text as taught by 
Mostow et al.

Claims 10 and 19 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Ferrell in view of Henton as applied to claim 17 above, and further in view of 
Adams, Jr. et al. 

Ferrell omits a record and playback module for providing playback of selected 
portions of audible speech and utterances from the user. However, Adams, Jr. et al. 
teaches a related system and method for interactive reading and language instruction 
including a session database for replay and resumption containing all the information 
necessary to provide a replay of the joint reading of the text by the companion and the
student (Column 4, Lines 17 to 29: Figure 2). Throughout the lesson the audio inputs
from both the student and the computer instructor, along with the text as displayed for 
utterance by each party, are stored at the session database. Adams, Jr. et al. suggests 
that this enhances the learning experience by identifying areas for concentrated effort in
the future (Column 7, Lines 45 to 51). It would have been obvious to one of ordinary
skill in the art to include a record and playback module in the system and method for 
interactive language training of Ferrell as suggested by Adams, Jr. et al. for the purpose 
of enhancing the lesson learning experience by identifying areas for concentrated effort. 



This is a Continued Prosecution Application of Applicant's earlier Application
No. 09/382,844. All claims are drawn to the same invention claimed in the earlier
application and could have been finally rejected on the grounds and art of record in the 
next Office action if they had been entered in the earlier application. Accordingly, THIS 
ACTION IS MADE FINAL even though it is a first action in this case. See MPEP 
§ 706.07(b). Applicant is reminded of the extension of time policy as set forth in 37 
CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later
than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Martin Lerner whose telephone number is (703) 308-9064.
The examiner can normally be reached on 8:30 AM to 6:00 PM Monday to
Thursday.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Marsha Banks-Harold, can be reached on (703) 305-4379. The fax phone
numbers for the organization where this application or proceeding is assigned are (703) 
872-9314 for regular communications and (703) 872-9314 for After Final 
communications. 

Any inquiry of a general nature or relating to the status of this application or 
proceeding should be directed to the receptionist whose telephone number is
(703) 305-4700.

SUPERVISORY PATENT EXAMINER